{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-74XLLwqPlcw"
   },
   "source": [
    "# Ungraded Lab: Preprocessing Images to Train a Neural Network\n",
    "\n",
    "Real world image datasets might not come as neatly packaged as the ones you've been using so far. Oftentimes, you will need to prepare them first before they can be fed to your model efficiently. You will see how to do that in this lab.\n",
    "\n",
    "In particular, you will:\n",
    "* build and train a model on the [Horses or Humans](https://www.tensorflow.org/datasets/catalog/horses_or_humans) dataset. This contains over a thousand images of horses and humans with varying poses and filesizes.\n",
    "*  preprocess these different images using the [image_dataset_from_directory](https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory) utility\n",
    "* use the [Rescaling](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Rescaling) layer to normalize the images\n",
    "* chain different methods of the [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) to configure it for performance\n",
    "\n",
    "Finally, you will upload images from the internet to see how your model distinguishes between horses and humans. Let's begin!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qYFguQkJvpV3"
   },
   "source": [
    "## Import the Libraries\n",
    "\n",
    "Start with importing all the libraries you will need."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import random\n",
    "import numpy as np\n",
    "from io import BytesIO\n",
    "\n",
    "# Plotting and dealing with images\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.image as mpimg\n",
    "\n",
    "import tensorflow as tf\n",
    "\n",
    "# Interactive widgets\n",
    "from ipywidgets import widgets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qYFguQkJvpV3"
   },
   "source": [
    "## Preview the Dataset\n",
    "\n",
    "The dataset is already downloaded in your lab environment. It's on the base directory `horse-or-human`, which in turn contains `horses` and `humans` subdirectories. By arranging the data this way, you do not need to explicitly label the images as horses or humans. The utility you will use later automatically labels images according to this directory structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "PLy3pthUS0D2"
   },
   "outputs": [],
   "source": [
    "TRAIN_DIR = 'horse-or-human'\n",
    "\n",
    "# You should see a `horse-or-human` folder here\n",
    "print(f\"files in current directory: {os.listdir()}\")\n",
    "\n",
    "# Check the subdirectories\n",
    "print(f\"\\nsubdirectories within '{TRAIN_DIR}' dir: {os.listdir(TRAIN_DIR)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LuBYtA_Zd8_T"
   },
   "source": [
    "Now see what the filenames look like in the `horses` and `humans` training directories:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "4PIP1rkmeAYS"
   },
   "outputs": [],
   "source": [
    "# Directory with the training horse pictures\n",
    "train_horse_dir = os.path.join(TRAIN_DIR, 'horses')\n",
    "\n",
    "# Directory with the training human pictures\n",
    "train_human_dir = os.path.join(TRAIN_DIR, 'humans')\n",
    "\n",
    "# Check the filenames\n",
    "train_horse_names = os.listdir(train_horse_dir)\n",
    "print(f\"5 files in horses subdir: {train_horse_names[:5]}\")\n",
    "train_human_names = os.listdir(train_human_dir)\n",
    "print(f\"5 files in humans subdir:{train_human_names[:5]}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "HlqN5KbafhLI"
   },
   "source": [
    "You can also find out the total number of images in each directory:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "H4XHh2xSfgie"
   },
   "outputs": [],
   "source": [
    "print(f\"total training horse images: {len(os.listdir(train_horse_dir))}\")\n",
    "print(f\"total training human images: {len(os.listdir(train_human_dir))}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "C3WZABE9eX-8"
   },
   "source": [
    "Take a look at a few pictures to get a better sense of what they look like. Display a batch of 8 horse and 8 human pictures. You can rerun the cell to see a fresh batch each time:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Wpr8GxjOU8in"
   },
   "outputs": [],
   "source": [
    "# Parameters for your graph; you will output images in a 4x4 configuration\n",
    "nrows = 4\n",
    "ncols = 4\n",
    "\n",
    "# Set up matplotlib fig, and size it to fit 4x4 pics\n",
    "fig = plt.gcf()\n",
    "fig.set_size_inches(ncols * 3, nrows * 3)\n",
    "\n",
    "next_horse_pix = [os.path.join(train_horse_dir, fname)\n",
    "                for fname in random.sample(train_horse_names, k=8)]\n",
    "next_human_pix = [os.path.join(train_human_dir, fname)\n",
    "                for fname in random.sample(train_human_names, k=8)]\n",
    "\n",
    "for i, img_path in enumerate(next_horse_pix + next_human_pix):\n",
    "    # Set up subplot; subplot indices start at 1\n",
    "    sp = plt.subplot(nrows, ncols, i + 1)\n",
    "    sp.axis('Off') # Don't show axes (or gridlines)\n",
    "\n",
    "    img = mpimg.imread(img_path)\n",
    "    plt.imshow(img)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "5oqBkNBJmtUv"
   },
   "source": [
    "## Building a Small Model from Scratch\n",
    "\n",
    "Now you will define the model architecture. You will add convolutional layers as in last week's labs, and flatten the final result to feed into the densely connected layers. Note that because this is a two-class classification problem (i.e. a *binary classification problem*) you will end your network with a [sigmoid activation](https://wikipedia.org/wiki/Sigmoid_function). This makes the output value of your network to be a number between 0 and 1. The closer it is to 0, the more likely that the image is a horse. The closer it is to 1, the more likely the image is a human."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "PixZ2s5QbYQ3"
   },
   "outputs": [],
   "source": [
    "model = tf.keras.models.Sequential([\n",
    "    # Note the input shape is the desired size of the image 300x300 with 3 bytes color\n",
    "    # This is the first convolution\n",
    "    tf.keras.Input(shape=(300, 300, 3)),\n",
    "    tf.keras.layers.Conv2D(16, (3,3), activation='relu'),\n",
    "    tf.keras.layers.MaxPooling2D(2, 2),\n",
    "    # The second convolution\n",
    "    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),\n",
    "    tf.keras.layers.MaxPooling2D(2,2),\n",
    "    # The third convolution\n",
    "    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
    "    tf.keras.layers.MaxPooling2D(2,2),\n",
    "    # The fourth convolution\n",
    "    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
    "    tf.keras.layers.MaxPooling2D(2,2),\n",
    "    # The fifth convolution\n",
    "    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),\n",
    "    tf.keras.layers.MaxPooling2D(2,2),\n",
    "    # Flatten the results to feed into a DNN\n",
    "    tf.keras.layers.Flatten(),\n",
    "    # 512 neuron hidden layer\n",
    "    tf.keras.layers.Dense(512, activation='relu'),\n",
    "    # Only 1 output neuron. It will contain a value from 0 to 1 where 0 is for 'horses' and 1 for 'humans'\n",
    "    tf.keras.layers.Dense(1, activation='sigmoid')\n",
    "])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "s9EaFDP5srBa"
   },
   "source": [
    "You can review the network architecture and the output shapes with `model.summary()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "7ZKj8392nbgP"
   },
   "outputs": [],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DmtkTn06pKxF"
   },
   "source": [
    "The \"output shape\" column shows how the size of your feature map evolves in each successive layer. As you saw in an earlier lesson, the convolution layers removes the outermost pixels of the image, and each pooling layer halves the dimensions."
   ]
  },
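  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The shape arithmetic described above can be traced by hand. The cell below is a small sketch in plain Python (it does not call TensorFlow) that assumes 3x3 convolutions without padding and 2x2 pooling with a stride of 2, matching the layers defined earlier. The helper name `trace_feature_map_sizes` is made up for this illustration. You can compare what it prints against the `model.summary()` table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def trace_feature_map_sizes(input_size, n_blocks, kernel_size=3, pool_size=2):\n",
    "    '''Return the spatial size after each conv and pooling layer (illustrative helper).'''\n",
    "    sizes = []\n",
    "    size = input_size\n",
    "    for _ in range(n_blocks):\n",
    "        # A convolution without padding trims (kernel_size - 1) pixels in total\n",
    "        size = size - (kernel_size - 1)\n",
    "        sizes.append(('conv', size))\n",
    "        # Max pooling divides the size by pool_size, rounding down\n",
    "        size = size // pool_size\n",
    "        sizes.append(('pool', size))\n",
    "    return sizes\n",
    "\n",
    "# Five conv/pool blocks applied to a 300x300 input, as in the model above\n",
    "for layer_type, layer_size in trace_feature_map_sizes(300, n_blocks=5):\n",
    "    print(f'{layer_type}: {layer_size}x{layer_size}')"
   ]
  },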
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PEkKSpZlvJXA"
   },
   "source": [
    "Next, you'll configure the specifications for model training. You will train the model with the [`binary_crossentropy`](https://www.tensorflow.org/api_docs/python/tf/keras/losses/BinaryCrossentropy) loss because it's a binary classification problem, and the final activation is a sigmoid. (_For a refresher on loss metrics, see this [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/descending-into-ml/video-lecture)._) You will use the `RMSProp` optimizer with a learning rate of `0.001`. During training, you will want to monitor classification accuracy.\n",
    "\n",
    "_**NOTE**: In this case, using the [RMSprop optimization algorithm](https://wikipedia.org/wiki/Stochastic_gradient_descent#RMSProp) is preferable to [stochastic gradient descent](https://developers.google.com/machine-learning/glossary/#SGD) (SGD), because RMSprop automates learning-rate tuning for us. (Other optimizers, such as [Adam](https://wikipedia.org/wiki/Stochastic_gradient_descent#Adam) and [Adagrad](https://developers.google.com/machine-learning/glossary/#AdaGrad), also automatically adapt the learning rate during training, and would work equally well here.)_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "8DHWhFP_uhq3"
   },
   "outputs": [],
   "source": [
    "model.compile(loss='binary_crossentropy',\n",
    "              optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),\n",
    "              metrics=['accuracy'])"
   ]
  },
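  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the loss more concrete, here is a minimal plain-Python sketch of binary cross-entropy for a handful of made-up labels and predictions. It only illustrates the formula and is not the Keras implementation; the helper name, the example values, and the `eps` clipping value are assumptions chosen for this example. Predictions that sit close to the true labels should give a smaller loss than predictions that lean the wrong way."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "def binary_crossentropy_sketch(y_true, y_pred, eps=1e-7):\n",
    "    '''Average binary cross-entropy over a list of examples (illustrative helper).'''\n",
    "    total = 0.0\n",
    "    for label, prob in zip(y_true, y_pred):\n",
    "        # Clip the prediction so log() never receives exactly 0\n",
    "        prob = min(max(prob, eps), 1 - eps)\n",
    "        total += -(label * math.log(prob) + (1 - label) * math.log(1 - prob))\n",
    "    return total / len(y_true)\n",
    "\n",
    "# Made-up labels (0 = horse, 1 = human) and predicted probabilities\n",
    "example_labels = [0, 1, 1, 0]\n",
    "confident_preds = [0.1, 0.9, 0.8, 0.2]\n",
    "poor_preds = [0.6, 0.4, 0.3, 0.7]\n",
    "\n",
    "print(f'loss for good predictions: {binary_crossentropy_sketch(example_labels, confident_preds):.4f}')\n",
    "print(f'loss for poor predictions: {binary_crossentropy_sketch(example_labels, poor_preds):.4f}')"
   ]
  },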
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Sn9m9D3UimHM"
   },
   "source": [
    "### Data Preprocessing\n",
    "\n",
    "Next is to prepare the dataset so it can be consumed by the model efficiently while training. To do that, you will use the [`image_from_dataset_directory`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory) utility to read pictures in the source folders, convert them to tensors, and combine them with their labels to form a [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset). This uses the [tf.data API](https://www.tensorflow.org/guide/data) which is optimized for parallel processing such as feeding data to GPUs and TPUs. It makes the training much faster than when using regular Numpy arrays.\n",
    "\n",
    "To use this utility, you only need to set the base directory. You can set additional arguments if you want to modify the defaults. In this case, you will set the resizing dimensions of the images, the batch size, and the type of labelling to apply. By default, the labels are assigned alphabetically. In this case, the said utility will assign the label `0` to the images in the `horses` directory and the label `1` to those in `humans`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "X4C_NGXTCzcY"
   },
   "outputs": [],
   "source": [
    "# Instantiate the dataset\n",
    "train_dataset = tf.keras.utils.image_dataset_from_directory(\n",
    "    TRAIN_DIR,\n",
    "    image_size=(300, 300),\n",
    "    batch_size=32,\n",
    "    label_mode='binary'\n",
    "    )\n",
    "\n",
    "# Check the type\n",
    "dataset_type = type(train_dataset)\n",
    "print(f'train_dataset inherits from tf.data.Dataset: {issubclass(dataset_type, tf.data.Dataset)}')"
   ]
  },
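  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The alphabetical labelling mentioned above can be sketched in plain Python. This only illustrates the idea (sort the subdirectory names, then number them from 0) and is not how the utility is implemented; `class_names_sketch` is a hypothetical list used for the example. The dataset returned by the utility also exposes a `class_names` attribute, which you can print on `train_dataset` to confirm the order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical subdirectory names, listed in arbitrary order\n",
    "class_names_sketch = ['humans', 'horses']\n",
    "\n",
    "# Sort alphabetically, then number the classes starting at 0\n",
    "label_map_sketch = {name: index for index, name in enumerate(sorted(class_names_sketch))}\n",
    "\n",
    "print(label_map_sketch)"
   ]
  },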
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7yayBg8O3TgX"
   },
   "source": [
    "You can examine one example to inspect how the data is structured. The cell below gets one batch of data using the [`take()`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#take) method and extracts the data. This should be a tuple of 2 elements which corresponds to an `(image, label)` pair. You can examine it further and confirm by checking the shape of the elements. The first should have `(300,300,3)` corresponding to the images and the other should have `(1)` for the single number labels. Both will have an additional `128` in front for the batch size set earlier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Sfs_aDTi1eSA"
   },
   "outputs": [],
   "source": [
    "# Get one batch from the dataset\n",
    "sample_batch = list(train_dataset.take(1))[0]\n",
    "\n",
    "# Check that the output is a pair\n",
    "print(f'sample batch data type: {type(sample_batch)}')\n",
    "print(f'number of elements: {len(sample_batch)}')\n",
    "\n",
    "# Extract image and label\n",
    "image_batch = sample_batch[0]\n",
    "label_batch = sample_batch[1]\n",
    "\n",
    "# Check the shapes\n",
    "print(f'image batch shape: {image_batch.shape}')\n",
    "print(f'label batch shape: {label_batch.shape}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "oRZL0RfX_3ZC"
   },
   "source": [
    "You can also preview the image array so you can compare the pixel values later in the next step of the preprocessing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "KQWFhhTLN-sB"
   },
   "outputs": [],
   "source": [
    "print(image_batch[0].numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bxT3YuiRbjQ0"
   },
   "source": [
    "Getting the minimum and maximum pixel values of the random image can give an estimate of the range present in the dataset. You should expect 0 to 255 but the random image you got may not have these extremes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "vmCkT4CqBbtK"
   },
   "outputs": [],
   "source": [
    "# Check the range of values\n",
    "print(f'max value: {np.max(image_batch[0].numpy())}')\n",
    "print(f'min value: {np.min(image_batch[0].numpy())}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gZzChUZK1XFh"
   },
   "source": [
    "As you may already know, data that goes into neural networks should usually be normalized in some way to make it more amenable to processing by the network (i.e. It is uncommon to feed raw pixels into a ConvNet.) In this case, you will preprocess the images by normalizing the pixel values to be in the `[0, 1]` range.\n",
    "\n",
    "In Tensorflow, you can use preprocessing layers to do this for you. In this case, you will use the [Rescaling](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Rescaling) layer. Here, you will pass in the scale you want to apply to the original pixels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "cJgIym8pDc-1"
   },
   "outputs": [],
   "source": [
    "rescale_layer = tf.keras.layers.Rescaling(scale=1./255)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "YC1c3NTgDQ6y"
   },
   "source": [
    "You can feed the image array you used earlier to verify that it works."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "TeCMbzqxC93B"
   },
   "outputs": [],
   "source": [
    "image_scaled = rescale_layer(image_batch[0]).numpy()\n",
    "\n",
    "print(image_scaled)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "N2aVSKlMD4t_"
   },
   "outputs": [],
   "source": [
    "print(f'max value: {np.max(image_scaled)}')\n",
    "print(f'min value: {np.min(image_scaled)}')"
   ]
  },
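  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of what the layer computed above, `Rescaling` multiplies each input value by `scale` and adds an `offset` (which defaults to 0). The plain-Python helper below, whose name and sample values are made up for this example, applies the same arithmetic to a few pixel values so you can see raw values in the 0 to 255 range land in the `[0, 1]` range."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rescale_sketch(values, scale, offset=0.0):\n",
    "    '''Apply value * scale + offset to every element of a list (illustrative helper).'''\n",
    "    return [value * scale + offset for value in values]\n",
    "\n",
    "# A few made-up raw pixel values\n",
    "sample_pixels = [0, 64, 128, 255]\n",
    "\n",
    "print(rescale_sketch(sample_pixels, scale=1./255))"
   ]
  },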
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QA_mmiivEPdl"
   },
   "source": [
    "There are a couple of ways to apply this transformation to the entire dataset and you will try one of them here (you will see the other approach in another course). You will develop a function that does the rescaling and pass it to the [`map()`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#map) method.\n",
    "\n",
    "You know that the dataset comes in `(image, label)` pairs so your function needs to accept those parameters and also output the same pair but with the image rescaled. You will do that below using a [`lambda function`](https://www.learnpython.org/en/Lambda_functions). There's an equivalent approach commented out if you want to declare a separate function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ELFt3oGwJ3KS"
   },
   "outputs": [],
   "source": [
    "# Rescale the image using a lambda function\n",
    "train_dataset_scaled = train_dataset.map(lambda image, label: (rescale_layer(image), label))\n",
    "\n",
    "\n",
    "# # Same result as above but without using a lambda function\n",
    "# # define a function to rescale the image\n",
    "# def rescale_image(image, label):\n",
    "#     return rescale_layer(image), label\n",
    "\n",
    "# dataset_scaled = dataset.map(rescale_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kEtdTd9IGAL1"
   },
   "source": [
    "As a sanity check, you can do a similar process from earlier and check the range of pixel values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Bo3KZk7FF_Sv"
   },
   "outputs": [],
   "source": [
    "# Get one batch of data\n",
    "sample_batch =  list(train_dataset_scaled.take(1))[0]\n",
    "\n",
    "# Get the image\n",
    "image_scaled = sample_batch[0][1].numpy()\n",
    "\n",
    "# Check the range of values for this image\n",
    "print(f'max value: {np.max(image_scaled)}')\n",
    "print(f'min value: {np.min(image_scaled)}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "G78bN27bTPDc"
   },
   "source": [
    "Lastly, you will chain in a few more methods to configure the dataset:\n",
    "* [cache()](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#cache) stores elements in memory as you use them so it will be faster to retrieve if you need them again\n",
    "* [shuffle()](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#shuffle), as the name suggests, shuffles the dataset randomly. A `buffer_size` of 1000 means it will first select a sample from the first 1,000 elements, then keep filling this buffer until all elements have been selected.\n",
    "* [prefetch()](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#prefetch) gets elements while the model is training so it's faster to feed in new data when the current training step is finished. A `buffer_size` set to `tf.data.AUTOTUNE` dynamically sets the number of elements to prefetch during runtime.\n",
    "\n",
    "`cache()` and `prefetch()` are particularly useful in speeding up the training. Try removing these later and you'll see that the training time per epoch will take about 3 to 4 times longer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "8pGeL7HpE3xX"
   },
   "outputs": [],
   "source": [
    "SHUFFLE_BUFFER_SIZE = 1000\n",
    "PREFETCH_BUFFER_SIZE = tf.data.AUTOTUNE\n",
    "\n",
    "train_dataset_final = (train_dataset_scaled\n",
    "                       .cache()\n",
    "                       .shuffle(SHUFFLE_BUFFER_SIZE)\n",
    "                       .prefetch(PREFETCH_BUFFER_SIZE)\n",
    "                      )"
   ]
  },
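  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The buffer behaviour described for `shuffle()` can be illustrated with a short plain-Python sketch. This is a simplified model of the idea and not the `tf.data` implementation: it fills a buffer, repeatedly emits a random element from it, and refills the freed slot with the next incoming element. The function name and the `seed` parameter are assumptions for this example. A small buffer only mixes nearby elements, while a buffer at least as large as the input can produce any ordering."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "def buffered_shuffle_sketch(elements, buffer_size, seed=None):\n",
    "    '''Yield elements in a randomized order using a fixed-size buffer (illustrative helper).'''\n",
    "    rng = random.Random(seed)\n",
    "    buffer = []\n",
    "    for element in elements:\n",
    "        if len(buffer) < buffer_size:\n",
    "            # Still filling the buffer\n",
    "            buffer.append(element)\n",
    "            continue\n",
    "        # Buffer is full: emit a random element and put the new one in its slot\n",
    "        index = rng.randrange(len(buffer))\n",
    "        yield buffer[index]\n",
    "        buffer[index] = element\n",
    "    # Input exhausted: drain whatever is left in random order\n",
    "    rng.shuffle(buffer)\n",
    "    yield from buffer\n",
    "\n",
    "# Compare a small buffer with one that holds the whole input\n",
    "print(list(buffered_shuffle_sketch(range(10), buffer_size=3, seed=0)))\n",
    "print(list(buffered_shuffle_sketch(range(10), buffer_size=10, seed=0)))"
   ]
  },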
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "mu3Jdwkjwax4"
   },
   "source": [
    "### Training\n",
    "\n",
    "You can start training for 15 epochs. Do note the values per epoch.\n",
    "\n",
    "The `loss` and `accuracy` are great indicators of progress in training. `loss` measures how far the current model prediction is from the known labels. `accuracy`, on the other hand, is the portion of correct guesses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Fb1_lgobv81m",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "history = model.fit(\n",
    "    train_dataset_final,\n",
    "    epochs=15,\n",
    "    verbose=2\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now you can plot the evolution of the accuracy over every epoch of training:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot the training accuracy for each epoch\n",
    "\n",
    "acc = history.history['accuracy']\n",
    "\n",
    "epochs = range(len(acc))\n",
    "\n",
    "plt.plot(epochs, acc, 'r', label='Training accuracy')\n",
    "plt.title('Training accuracy')\n",
    "plt.legend(loc=0)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, accuracy improves over time! Nice!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "o6vSHzPR2ghH"
   },
   "source": [
    "### Model Prediction\n",
    "\n",
    "Now take a look at actually running a prediction using the model. This code will allow you to choose 1 or more files from your file system, upload them, and run them through the model, giving an indication of whether the object is a horse or a human."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "DoWp43WxJDNT"
   },
   "outputs": [],
   "source": [
    "# Create the widget and take care of the display\n",
    "uploader = widgets.FileUpload(accept=\"image/*\", multiple=True)\n",
    "display(uploader)\n",
    "out = widgets.Output()\n",
    "display(out)\n",
    "\n",
    "def file_predict(filename, file, out):\n",
    "    \"\"\" A function for creating the prediction and printing the output.\"\"\"\n",
    "    image = tf.keras.utils.load_img(file, target_size=(300, 300))\n",
    "    image = tf.keras.utils.img_to_array(image)\n",
    "    image = rescale_layer(image)\n",
    "    image = np.expand_dims(image, axis=0)\n",
    "    \n",
    "    prediction = model.predict(image, verbose=0)[0][0]\n",
    "    \n",
    "    with out:\n",
    "        if prediction > 0.5:\n",
    "            print(filename + \" is a human\")\n",
    "        else:\n",
    "            print(filename + \" is a horse\")\n",
    "\n",
    "\n",
    "def on_upload_change(change):\n",
    "    \"\"\" A function for geting files from the widget and running the prediction.\"\"\"\n",
    "    # Get the newly uploaded file(s)\n",
    "    \n",
    "    items = change.new\n",
    "    for item in items: # Loop if there is more than one file uploaded  \n",
    "        file_jpgdata = BytesIO(item.content)\n",
    "        file_predict(item.name, file_jpgdata, out)\n",
    "\n",
    "# Run the interactive widget\n",
    "# Note: it may take a bit after you select the image to upload and process before you see the output.\n",
    "uploader.observe(on_upload_change, names='value')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-8EHQyWGDvWz"
   },
   "source": [
    "### Visualizing Intermediate Representations\n",
    "\n",
    "To get a feel for what kind of features your CNN has learned, one fun thing to do is to visualize how an input gets transformed as it goes through the model.\n",
    "\n",
    "You can pick a random image from the training set, and then generate a figure where each row is the output of a layer, and each image in the row is a specific filter in that output feature map. Rerun this cell to generate intermediate representations for a variety of training images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "-5tES8rXFjux"
   },
   "outputs": [],
   "source": [
    "# Define a new Model that will take an image as input, and will output\n",
    "# intermediate representations for all layers in the previous model after\n",
    "# the first.\n",
    "successive_outputs = [layer.output for layer in model.layers[1:]]\n",
    "visualization_model = tf.keras.models.Model(inputs = model.inputs, outputs = successive_outputs)\n",
    "\n",
    "# Prepare a random input image from the training set.\n",
    "horse_img_files = [os.path.join(train_horse_dir, f) for f in train_horse_names]\n",
    "human_img_files = [os.path.join(train_human_dir, f) for f in train_human_names]\n",
    "img_path = random.choice(horse_img_files + human_img_files)\n",
    "\n",
    "img = tf.keras.utils.load_img(img_path, target_size=(300, 300))  # this is a PIL image\n",
    "x = tf.keras.utils.img_to_array(img)  # Numpy array with shape (300, 300, 3)\n",
    "x = x.reshape((1,) + x.shape)  # Numpy array with shape (1, 300, 300, 3)\n",
    "\n",
    "# Scale by 1/255\n",
    "x = rescale_layer(x)\n",
    "\n",
    "# Run the image through the network, thus obtaining all\n",
    "# intermediate representations for this image.\n",
    "successive_feature_maps = visualization_model.predict(x, verbose=False)\n",
    "\n",
    "# These are the names of the layers, so you can have them as part of the plot\n",
    "layer_names = [layer.name for layer in model.layers[1:]]\n",
    "\n",
    "# Display the representations\n",
    "for layer_name, feature_map in zip(layer_names, successive_feature_maps):\n",
    "    if len(feature_map.shape) == 4:\n",
    "\n",
    "        # Just do this for the conv / maxpool layers, not the fully-connected layers\n",
    "        n_features = feature_map.shape[-1]  # number of features in feature map\n",
    "\n",
    "        # The feature map has shape (1, size, size, n_features)\n",
    "        size = feature_map.shape[1]\n",
    "\n",
    "        # Tile the images in this matrix\n",
    "        display_grid = np.zeros((size, size * n_features))\n",
    "        for i in range(n_features):\n",
    "            x = feature_map[0, :, :, i]\n",
    "            x -= x.mean()\n",
    "            x /= x.std()\n",
    "            x *= 64\n",
    "            x += 128\n",
    "            x = np.clip(x, 0, 255).astype('uint8')\n",
    "\n",
    "            # Tile each filter into this big horizontal grid\n",
    "            display_grid[:, i * size : (i + 1) * size] = x\n",
    "\n",
    "        # Display the grid\n",
    "        scale = 20. / n_features\n",
    "        plt.figure(figsize=(scale * n_features, scale))\n",
    "        plt.title(layer_name)\n",
    "        plt.grid(False)\n",
    "        plt.imshow(display_grid, aspect='auto', cmap='viridis')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tuqK2arJL0wo"
   },
   "source": [
    "You can see above how the pixels highlighted turn to increasingly abstract and compact representations, especially at the bottom grid.\n",
    "\n",
    "The representations downstream start highlighting what the network pays attention to, and they show fewer and fewer features being \"activated\"; most are set to zero. This is called _representation sparsity_ and is a key feature of deep learning. These representations carry increasingly less information about the original pixels of the image, but increasingly refined information about the class of the image. You can think of a convnet (or a neural network in general) as an information distillation pipeline wherein each layer filters out the most useful features."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrap Up\n",
    "\n",
    "This concludes this first lab on image data preprocessing, and you will build upon the results here in the next lessons. Before doing so, run the cell below to free up resources for the next lab. You might see a pop-up about restarting the kernel afterwards. You can safely ignore it and just press `Ok`. You can then close this lab, then go back to the classroom for the next lecture. See you there!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Shutdown the kernel to free up resources. \n",
    "# Note: You can expect a pop-up when you run this cell. You can safely ignore that and just press `Ok`.\n",
    "\n",
    "from IPython import get_ipython\n",
    "\n",
    "k = get_ipython().kernel\n",
    "\n",
    "k.do_shutdown(restart=False)"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "private_outputs": true,
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0rc1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
